Abstract

On single-processor and SMP machines, memory is the allocatable quantity;
on machines made up of large numbers of parallel computing units, each with
its own local memory, the allocatable quantity is a single computing unit.
Just as virtual address management is used to keep memory coherent and to
allow allocation of more memory than is physically available, virtual
communication channel references can be used to keep computing units
connected across allocation and swapping.
Parallel Architecture



For various reasons, an alternative to SMP-based parallel computing is
assumed for use with dynamic applications: large numbers of computing units,
each composed of a processing unit and local memory [1]. To allow computing
units to cooperate, they shall be connected by some network of communication
channels. With each computing unit programmed much like an MMU-less
microcontroller, the full network is understood as a parallel computing
system in the sense of communicating sequential processes [2]. The
individual computing units should not differ in connectivity or amount of
local memory.



Resource Allocation and Usage 

On a single-processor or SMP machine, system-global memory is the main
resource to administer, and it is usually partitioned into memory pages
(figure 1).
On a parallel computing system, however, the computing units are the main
resource, and they already come partitioned into units: running an
application makes use of an arbitrary, not necessarily fixed, number of
computing units (figure 2).



figure 2 
That is, memory is never a passive resource to allocate; it is always served
in conjunction with a processing unit. It is not accessed through memory
addresses but through communication channels, which in turn are addressed
using a number for each channel end, the channel end address. For computing
unit "A" to access some computing unit "B", it configures its channel end
"a" to communicate with channel end "b" at computing unit "B" and
subsequently transmits data over the channel. The network layer of the
channel is managed automatically by interconnect node hardware; a good
real-world example is provided by XMOS [3].
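
The addressing scheme described above can be illustrated with a small Python
model. This is only a sketch under stated assumptions: the class names
`Interconnect` and `ChannelEnd`, the `(unit, end)` tuple used as a channel
end address, and the in-memory dictionary standing in for the interconnect
node hardware are all hypothetical and are not taken from the described
system or from XMOS hardware.

```python
from collections import deque


class Interconnect:
    """Hypothetical stand-in for the interconnect hardware.

    Routes data purely by destination channel end address.
    """

    def __init__(self):
        self.ends = {}  # channel end address -> ChannelEnd

    def register(self, end):
        self.ends[end.address] = end

    def deliver(self, dest_address, data):
        self.ends[dest_address].inbox.append(data)


class ChannelEnd:
    """One end of a channel, owned by a computing unit."""

    def __init__(self, net, unit_id, end_id):
        self.net = net
        self.address = (unit_id, end_id)  # assumed address format
        self.dest = None                  # not yet configured
        self.inbox = deque()
        net.register(self)

    def set_destination(self, dest_address):
        # Configure this end to communicate with a remote channel end.
        self.dest = dest_address

    def send(self, data):
        if self.dest is None:
            raise RuntimeError("channel end has no destination configured")
        self.net.deliver(self.dest, data)

    def receive(self):
        return self.inbox.popleft()


net = Interconnect()
a = ChannelEnd(net, "A", 0)   # channel end "a" at computing unit "A"
b = ChannelEnd(net, "B", 0)   # channel end "b" at computing unit "B"
a.set_destination(b.address)  # "A" configures "a" to talk to "b"
a.send("hello")               # data travels over the channel
received = b.receive()
```

Note that unit "A" never touches the memory of unit "B" directly; all access
goes through the configured channel end.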
Resources Demand versus Availability



To cope with application programs needing larger amounts of memory than are
actually available on a system, and to avoid fragmentation, virtual address
translation was introduced for single-processor systems in 1977.¹ Through
virtual address translation, the main resource used by an application
(virtual memory) is mapped to the main resource offered by the machine
(physical memory, figure 3).



figure 3 
Now, with computing units being the main resource on a parallel computing
system, and with each unit referenced by the channel end addresses it
offers, we introduce a distinction between virtual channel end addresses, as
used by applications, and physical channel end addresses, as needed by the
interconnect node hardware. This scheme calls for an equivalent to address
translation tables: some means of channel end address translation.
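
The distinction between virtual and physical channel end addresses can be
sketched as a simple lookup table. The class name
`ChannelEndTranslationTable`, the integer virtual address and the
`(unit, end)` tuple used as a physical address are assumptions made for
illustration only. The point of the sketch is that an application keeps
using the same virtual address while the physical destination behind it
changes, for example when a task is swapped to a different computing unit.

```python
class ChannelEndTranslationTable:
    """Hypothetical table: virtual -> physical channel end address."""

    def __init__(self):
        self._entries = {}

    def map(self, virtual, physical):
        # Create or update the mapping for one virtual address.
        self._entries[virtual] = physical

    def translate(self, virtual):
        return self._entries[virtual]


table = ChannelEndTranslationTable()

# The application only ever knows the virtual address 0x10.
table.map(0x10, ("unit-7", 2))
before = table.translate(0x10)

# The destination is relocated to another computing unit:
# only the table entry changes, the virtual address stays valid.
table.map(0x10, ("unit-3", 0))
after = table.translate(0x10)
```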



Channel End Address Translation 

As with conventional memory address translation, channel end address
translation requires translation tables to be provided.
Different approaches lend themselves to implementing such tables:

• Explicit address translation is provided by some dedicated computing 
unit 

• Implicit address translation is performed automatically by some single 
facility at a single central location. When establishing a connection, 
the initiating computing unit automatically requests the channel end 
address translation at the central facility to find the physical destination 

• Implicit address translation is performed automatically, but is
distributed, e.g. equally across the interconnect nodes. When establishing a
connection, the initiating computing unit "A" automatically requests
the channel end address translation at the responsible interconnect node
(next to computing unit "T") to find the physical destination, computing
unit "B" (figure 4)

¹ VAX by Digital Equipment Corporation
figure 4
In fact, there are real examples of similar systems. One is the Domain Name
System (DNS), which translates node names into IP addresses. Here, from the
application's point of view, translation is performed explicitly.

Implementation 

Since routes are already established automatically, and to avoid a drop in
performance, channel end address translation should be implemented as an
automatic feature, too. To avoid congestion at a central facility, and
because virtual addresses may be chosen without regard to the numbering,
address translation tables, and thus virtual addresses, shall be provided
per interconnect node. For example, the upper part of the channel end
address may be used to determine the responsible interconnect node, with the
lower part of the address being the virtual address part. The lower part
indexes the translation table at the responsible interconnect node (the one
next to computing unit "T") and is thereby translated to a full physical
channel end address. This physical channel end address is returned to the
establishing computing unit, which uses it to complete the connection to the
physical destination, computing unit "B".
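
The split of a channel end address into a node part and a table index can be
sketched as follows. The field widths (16 bits each), the names
`split_virtual_address` and `InterconnectNode`, and the example address
values are assumptions chosen for illustration; the text above does not
prescribe any particular address format.

```python
NODE_BITS = 16   # assumed width of the interconnect node field (upper part)
INDEX_BITS = 16  # assumed width of the table index field (lower part)


def split_virtual_address(vaddr):
    """Return (node_id, index) for a virtual channel end address."""
    node_id = (vaddr >> INDEX_BITS) & ((1 << NODE_BITS) - 1)
    index = vaddr & ((1 << INDEX_BITS) - 1)
    return node_id, index


class InterconnectNode:
    """Hypothetical interconnect node holding its own translation table."""

    def __init__(self, table_size):
        self.table = [None] * table_size  # index -> physical address

    def translate(self, index):
        return self.table[index]


def translate(nodes, vaddr):
    # The upper part selects the responsible node,
    # the lower part indexes that node's table.
    node_id, index = split_virtual_address(vaddr)
    return nodes[node_id].translate(index)


# Node 2 plays the role of the node next to computing unit "T".
nodes = {2: InterconnectNode(table_size=8)}
nodes[2].table[5] = 0x00070001  # hypothetical physical address of "B"

vaddr = (2 << INDEX_BITS) | 5   # virtual address used by unit "A"
physical = translate(nodes, vaddr)
```

Because each node owns only its own table, lookups for different nodes do
not contend for a single central facility.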

Whether it is favourable to keep translation results in local caches for
repeated use by the establishing computing unit is subject to research. From
the point of view of system simplicity, caches should be avoided altogether.
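
The trade-off can be made concrete with a small sketch of a local
translation cache. The class `CachingTranslator` and its `invalidate` method
are hypothetical; the sketch only shows one reason why caching complicates
the system: a cached result becomes stale as soon as the destination is
remapped, so some invalidation mechanism would be needed.

```python
class CachingTranslator:
    """Hypothetical per-unit cache in front of a remote translation lookup."""

    def __init__(self, lookup):
        self.lookup = lookup      # callable: virtual -> physical address
        self.cache = {}
        self.remote_requests = 0  # counts lookups that left the unit

    def translate(self, vaddr):
        if vaddr not in self.cache:
            self.remote_requests += 1
            self.cache[vaddr] = self.lookup(vaddr)
        return self.cache[vaddr]

    def invalidate(self, vaddr):
        self.cache.pop(vaddr, None)


table = {0x20005: 0x70001}              # stands in for the node's table
translator = CachingTranslator(table.__getitem__)

first = translator.translate(0x20005)   # remote request
second = translator.translate(0x20005)  # served from the local cache

table[0x20005] = 0x30000                # destination swapped elsewhere
stale = translator.translate(0x20005)   # cache still holds the old address

translator.invalidate(0x20005)
fresh = translator.translate(0x20005)   # remote request, new address
```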

Channel end address translation may fail. On a system that supports
exception handling, an appropriate exception handler might be triggered on
computing unit "A" or on interconnect node "T". Again, for reasons of
simplicity, it may be desirable to avoid exceptions altogether. To achieve
this, the translation facility should be configurable to send a message, an
exception signal, to a dedicated computing unit, which in turn is
responsible for handling the failure by loading or swapping appropriate code.
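
The message-based alternative to exceptions might look like the following
sketch. All names (`TranslationFacility`, `handle_failures`, the message
tuple layout and the `allocate` callback that stands in for loading or
swapping code) are assumptions for illustration, not a prescribed design.

```python
from collections import deque


class TranslationFacility:
    """Hypothetical translation facility that signals failures by message."""

    def __init__(self, table, handler_queue):
        self.table = table                  # virtual -> physical address
        self.handler_queue = handler_queue  # inbox of the dedicated unit

    def translate(self, requester, vaddr):
        physical = self.table.get(vaddr)
        if physical is None:
            # No exception is raised; the dedicated unit is notified instead.
            self.handler_queue.append(("translation_failed", requester, vaddr))
        return physical


def handle_failures(queue, table, allocate):
    """Runs on the dedicated computing unit: resolve each reported failure."""
    while queue:
        _kind, _requester, vaddr = queue.popleft()
        # `allocate` stands in for loading or swapping the appropriate code
        # and returns the physical address where it now resides.
        table[vaddr] = allocate(vaddr)


table = {}
failures = deque()
facility = TranslationFacility(table, failures)

first_try = facility.translate("A", 0x20005)  # fails, signal is queued
pending = len(failures)

handle_failures(failures, table, allocate=lambda vaddr: 0x70001)
second_try = facility.translate("A", 0x20005)  # now succeeds
```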